# Set the folder (path) that contains this R file as the working directory
#dir <- dirname(rstudioapi::getActiveDocumentContext()$path)
#setwd(dir)
library(igraph)
## Warning: package 'igraph' was built under R version 3.5.3
##
## Attaching package: 'igraph'
## The following objects are masked from 'package:stats':
##
## decompose, spectrum
## The following object is masked from 'package:base':
##
## union
library(data.table)
## Warning: package 'data.table' was built under R version 3.5.3
library(stringr)
## Warning: package 'stringr' was built under R version 3.5.3
library(igraphdata)
## Warning: package 'igraphdata' was built under R version 3.5.3
openflights <- fread("openflights.dl.tsv", header = F)
colnames(openflights) <- c("from", "to", "weight")
metadata <- read.csv("airports.dat.csv", header = T)
metadata <- metadata
flights <- graph.data.frame(as.data.frame(openflights))
head(openflights)
## from to weight
## 1: 1 5 1
## 2: 2 4 1
## 3: 2 5 1
## 4: 2 6 2
## 5: 2 5430 1
## 6: 3 2 1
summary(flights)
## IGRAPH ca91f15 DNW- 2939 30501 --
## + attr: name (v/c), weight (e/n)
The summary describes the graph as directed, named, and weighted (DNW). The graph has 2,939 nodes and 30,501 edges. The name attribute is a vertex-level attribute of character type, while the weight attribute is an edge-level attribute of numeric type.
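As a minimal sketch of how the summary line maps onto the graph object, the six rows printed by `head(openflights)` can be loaded from inline text and inspected. The names `rows`, `toy_edges`, and `toy` are illustrative assumptions; the analysis itself reads the full TSV file from disk.

```r
library(igraph)

# The six rows printed by head(openflights), as inline tab-separated text
rows <- "1\t5\t1\n2\t4\t1\n2\t5\t1\n2\t6\t2\n2\t5430\t1\n3\t2\t1"
toy_edges <- read.table(text = rows, sep = "\t", header = FALSE,
                        col.names = c("from", "to", "weight"))

# Columns 1 and 2 define the edges; column 3 becomes the edge attribute "weight"
toy <- graph_from_data_frame(toy_edges, directed = TRUE)

is_directed(toy)        # the "D" flag in the summary line
is_named(toy)           # the "N" flag: vertices carry a name attribute
is_weighted(toy)        # the "W" flag: edges carry a weight attribute
vertex_attr_names(toy)  # vertex-level attributes
edge_attr_names(toy)    # edge-level attributes
c(nodes = vcount(toy), edges = ecount(toy))
```

The same checks can be run on the full `flights` object to confirm the flags and counts reported by `summary()`.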
Since our graph is directed, we first look at the total-degree distribution, treating the graph as undirected. We then look at the in-degree and out-degree distributions, together with their means and standard deviations, to better understand the graph at hand.
### Directed Flights Graph total-degree distribution
deg <- degree(as.undirected(flights), mode="total")
hist(deg, main="Histogram of Total node degree",xlim=c(0,50), ylim=c(0,1500),breaks = 100)
deg.dist <- degree_distribution(as.undirected(flights), cumulative=T, mode="total")
plot( x=0:max(deg), y=1-deg.dist, pch=19, cex=1.2, col="orange",
xlab="Degree", ylab="Cumulative Frequency for total degree", xlim = c(0,100))
sprintf("mean: %f",mean(deg))
## [1] "mean: 10.668255"
sprintf("sd: %f",sd(deg))
## [1] "sd: 21.929753"
deg <- degree(as.directed(flights), mode="in")
hist(deg, main="Histogram of node in-degree",xlim=c(0,50), ylim=c(0,1500),breaks = 100)
deg.dist <- degree_distribution(flights, cumulative=T, mode="in")
plot( x=0:max(deg), y=1-deg.dist, pch=19, cex=1.2, col="orange",
xlab="Degree", ylab="Cumulative Frequency for in degree", xlim = c(0,100))
sprintf("mean: %f",mean(deg))
## [1] "mean: 10.378020"
sprintf("sd: %f",sd(deg))
## [1] "sd: 21.580055"
deg <- degree(as.directed(flights), mode="out")
hist(deg, main="Histogram of node out-degree",xlim=c(0,50), ylim=c(0,1500),breaks = 100)
deg.dist <- degree_distribution(flights, cumulative=T, mode="out")
plot( x=0:max(deg), y=1-deg.dist, pch=19, cex=1.2, col="orange",
xlab="Degree", ylab="Cumulative Frequency for out degree", xlim = c(0,100))
sprintf("mean: %f",mean(deg))
## [1] "mean: 10.378020"
sprintf("sd: %f",sd(deg))
## [1] "sd: 21.649405"
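The three degree modes used above can be compared on a small graph. The sketch below uses a hypothetical three-node graph, not the flights data, and assumes the default behaviour of `as.undirected()`, which collapses a pair of opposite edges into a single undirected edge.

```r
library(igraph)

# Toy directed graph (hypothetical): A and B are linked in both directions
edges <- data.frame(from = c("A", "B", "A"), to = c("B", "A", "C"))
g <- graph_from_data_frame(edges, directed = TRUE)

deg_in    <- degree(g, mode = "in")     # incoming edges per node
deg_out   <- degree(g, mode = "out")    # outgoing edges per node
deg_total <- degree(g, mode = "total")  # in + out on the directed graph

# as.undirected() merges A->B and B->A into one edge by default,
# so this degree counts distinct neighbours rather than in + out
deg_undirected <- degree(as.undirected(g))

print(rbind(deg_in, deg_out, deg_total, deg_undirected))
```

On the flights graph, this collapsing may be why the mean total degree (10.67) is far below the sum of the mean in-degree and mean out-degree (10.38 each): a route flown in both directions counts once after the conversion.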
As the charts show, the degree distribution of our network has a long tail. This reflects the fact that many of the airports in our dataset are either very small local airports or privately owned airports. The nodes with higher degrees are international airports that connect major cities to each other and to the rest of the world. The biggest airports tend to have many edges, often with higher weights, which causes the skewness in the degree distribution. Since our graph is directed, the in-degree reflects the flights towards an airport, while the out-degree describes the outbound flights.
#### Network Diameter
Network diameter, average path length, and the clustering coefficient, without considering weights
sprintf("Flights Network diameter: %d",diameter(flights, directed=T, weights = NA))
## [1] "Flights Network diameter: 17"
print("this is the longest shortest path, using airport ids, going from id 5522 to id 7430")
## [1] "this is the longest shortest path, using airport ids, going from id 5522 to id 7430"
E(flights, path=get_diameter(flights))
## + 17/30501 edges from ca91f15 (vertex names):
## [1] 5522->5482 5482->5543 5543->5490 5490->91 91 ->143 143 ->133
## [7] 133 ->144 144 ->111 111 ->1382 1382->912 912 ->929 929 ->931
## [13] 931 ->5619 5619->5618 5618->5616 5616->5621 5621->7430
sprintf("Flights Network Average Path Length: %f",mean_distance(flights, directed=T))
## [1] "Flights Network Average Path Length: 4.145036"
The diameter is the longest of all shortest paths in our network. It is the minimum distance you have to travel, measured in airports you pass through, to go from one end of the network to the other using connecting flights only.
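A minimal sketch of this definition on a hypothetical four-node chain (not the flights graph): the diameter equals the largest finite entry of the pairwise distance matrix. The toy weights of 5 are an assumption added only to show the effect of `weights = NA`.

```r
library(igraph)

# Toy directed chain 1 -> 2 -> 3 -> 4 (hypothetical)
g <- make_graph(c(1, 2, 2, 3, 3, 4), directed = TRUE)

# Pairwise hop counts; unreachable pairs are Inf
d <- distances(g, mode = "out")
longest_shortest <- max(d[is.finite(d)])

diam_hops <- diameter(g, directed = TRUE, weights = NA)
diam_path <- as_ids(get_diameter(g, directed = TRUE, weights = NA))
avg_path  <- mean_distance(g, directed = TRUE)

# Once a weight attribute exists, diameter() uses it unless weights = NA
E(g)$weight <- c(5, 5, 5)
diam_weighted <- diameter(g, directed = TRUE)

print(c(hops = diam_hops, weighted = diam_weighted, mean = avg_path))
```

Passing the same `weights` argument to both `diameter()` and `get_diameter()` keeps the reported length and the reported path consistent on a weighted graph.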
This is a plot of the diameter, or largest shortest distance
diam <- get_diameter(flights, directed=T)
source_diameter <- as.character(metadata[metadata$id == 5522, "city"])
target_diameter <-as.character(metadata[metadata$id == 5621, "city"])
sprintf("Going from %s", source_diameter)
## [1] "Going from Peawanuck"
sprintf("to %s",target_diameter)
## [1] "to Tsiroanomandidy"
sprintf("we have to pass through %s airports!",diameter(flights)-1) # minus 1 because the final id on the path (7430) has no matching name in our airports metadata
## [1] "we have to pass through 16 airports!"
vcol <- rep("gray40", vcount(flights))
vcol[diam] <- "gold"
ecol <- rep("gray80", ecount(flights))
ecol[E(flights, path=diam)] <- "orange"
E(flights, path=diam) # finds edges along a path, here 'diam'
## + 17/30501 edges from ca91f15 (vertex names):
## [1] 5522->5482 5482->5543 5543->5490 5490->91 91 ->143 143 ->133
## [7] 133 ->144 144 ->111 111 ->1382 1382->912 912 ->929 929 ->931
## [13] 931 ->5619 5619->5618 5618->5616 5616->5621 5621->7430
plot(flights, vertex.color=vcol, edge.color=ecol, edge.arrow.mode=0, vertex.label= NA)
Calculating the local and global clustering coefficients
sprintf("Flights Network Clustering Coefficient: %f",transitivity(as.undirected(flights),type="global", weights = NA))
## [1] "Flights Network Clustering Coefficient: 0.254718"
sprintf("Flights Network Graph average local clustering coefficient: %f",mean(transitivity(as.directed(flights),type="local", weights = NA), na.rm = T))
## [1] "Flights Network Graph average local clustering coefficient: 0.123600"
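To make the difference between the two coefficients concrete, here is a small sketch on a hypothetical four-node graph (a triangle with one pendant node), where both values can be worked out by hand.

```r
library(igraph)

# Toy undirected graph (hypothetical): triangle 1-2-3 plus a pendant vertex 4
g <- make_graph(c(1, 2, 2, 3, 1, 3, 3, 4), directed = FALSE)

# Global: 3 * triangles / connected triples = 3 * 1 / 5
global_cc <- transitivity(g, type = "global")

# Local: share of each vertex's neighbour pairs that are themselves linked;
# vertex 4 has a single neighbour, so its value is NaN
local_cc <- transitivity(g, type = "local")

# Average of the local values, skipping the undefined ones
avg_local_cc <- mean(local_cc, na.rm = TRUE)

print(c(global = global_cc, avg_local = avg_local_cc))
```

The two numbers differ because the global coefficient weights every connected triple equally, while the average local coefficient weights every vertex equally.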
deg <- degree(flights, mode="total")
btw <-betweenness(flights)
cls <-closeness(flights)
## Warning in closeness(flights): At centrality.c:2617 :closeness centrality
## is not well-defined for disconnected graphs
centrality_table <- cbind(deg, btw, cls)
centrality_table <- as.data.frame(centrality_table)
centrality_table <- setDT(centrality_table, keep.rownames = TRUE)[]
centrality_table$rn <- as.numeric(centrality_table$rn)
centrality_table <- merge(centrality_table,metadata, by.x = "rn",by.y = "id", all = T)
centrality_table[order(centrality_table$deg, decreasing = T),][1:20]
## rn deg btw cls
## 1: 340 473 609655.81 6.186587e-06
## 2: 1382 426 390233.09 6.166559e-06
## 3: 580 395 358465.59 6.172420e-06
## 4: 3364 340 253341.16 6.155627e-06
## 5: 502 339 158156.60 6.147604e-06
## 6: 1701 338 262874.91 6.169640e-06
## 7: 3682 335 317910.95 6.163556e-06
## 8: 2188 331 454134.18 6.176004e-06
## 9: 4029 317 315105.61 6.153657e-06
## 10: 507 315 222811.26 6.155551e-06
## 11: 1229 311 234044.76 6.163632e-06
## 12: 1555 307 182097.58 6.165152e-06
## 13: 346 297 116526.20 6.164886e-06
## 14: 1218 284 119789.97 6.161012e-06
## 15: 3797 282 276848.04 6.172116e-06
## 16: 548 277 86059.93 6.120363e-06
## 17: 3877 277 228709.89 6.125987e-06
## 18: 3885 264 283263.81 6.150062e-06
## 19: 599 260 73546.93 6.140243e-06
## 20: 3406 252 175861.27 6.145828e-06
## name city
## 1: Frankfurt am Main Airport Frankfurt
## 2: Charles de Gaulle International Airport Paris
## 3: Amsterdam Airport Schiphol Amsterdam
## 4: Beijing Capital International Airport Beijing
## 5: London Gatwick Airport London
## 6: Atatürk International Airport Istanbul
## 7: Hartsfield Jackson Atlanta International Airport Atlanta
## 8: Dubai International Airport Dubai
## 9: Domodedovo International Airport Moscow
## 10: London Heathrow Airport London
## 11: Adolfo Suárez Madrid–Barajas Airport Madrid
## 12: Leonardo da Vinci–Fiumicino Airport Rome
## 13: Munich Airport Munich
## 14: Barcelona International Airport Barcelona
## 15: John F Kennedy International Airport New York
## 16: London Stansted Airport London
## 17: McCarran International Airport Las Vegas
## 18: Suvarnabhumi Airport Bangkok
## 19: Dublin Airport Dublin
## 20: Shanghai Pudong International Airport Shanghai
## country IATA ICAO Latitude Longitude Altitude Timezone
## 1: Germany FRA EDDF 50.03333 8.570556 364 1
## 2: France CDG LFPG 49.01280 2.550000 392 1
## 3: Netherlands AMS EHAM 52.30860 4.763890 -11 1
## 4: China PEK ZBAA 40.08010 116.584999 116 8
## 5: United Kingdom LGW EGKK 51.14810 -0.190278 202 0
## 6: Turkey ISL LTBA 40.97690 28.814600 163 3
## 7: United States ATL KATL 33.63670 -84.428101 1026 -5
## 8: United Arab Emirates DXB OMDB 25.25280 55.364399 62 4
## 9: Russia DME UUDD 55.40880 37.906300 588 3
## 10: United Kingdom LHR EGLL 51.47060 -0.461941 83 0
## 11: Spain MAD LEMD 40.47193 -3.562640 1998 1
## 12: Italy FCO LIRF 41.80028 12.238889 13 1
## 13: Germany MUC EDDM 48.35380 11.786100 1487 1
## 14: Spain BCN LEBL 41.29710 2.078460 12 1
## 15: United States JFK KJFK 40.63980 -73.778900 13 -5
## 16: United Kingdom STN EGSS 51.88500 0.235000 348 0
## 17: United States LAS KLAS 36.08010 -115.152000 2181 -8
## 18: Thailand BKK VTBS 13.68110 100.747002 5 7
## 19: Ireland DUB EIDW 53.42130 -6.270070 242 0
## 20: China PVG ZSPD 31.14340 121.805000 13 8
## DST TZ Type Source
## 1: E Europe/Berlin airport OurAirports
## 2: E Europe/Paris airport OurAirports
## 3: E Europe/Amsterdam airport OurAirports
## 4: U Asia/Shanghai airport OurAirports
## 5: E Europe/London airport OurAirports
## 6: E Europe/Istanbul airport OurAirports
## 7: A America/New_York airport OurAirports
## 8: U Asia/Dubai airport OurAirports
## 9: N Europe/Moscow airport OurAirports
## 10: E Europe/London airport OurAirports
## 11: E Europe/Madrid airport OurAirports
## 12: E Europe/Rome airport OurAirports
## 13: E Europe/Berlin airport OurAirports
## 14: E Europe/Madrid airport OurAirports
## 15: A America/New_York airport OurAirports
## 16: E Europe/London airport OurAirports
## 17: A America/Los_Angeles airport OurAirports
## 18: U Asia/Bangkok airport OurAirports
## 19: E Europe/Dublin airport OurAirports
## 20: U Asia/Shanghai airport OurAirports
### Ranking the Top 30 nodes with the highest Betweenness Centrality
centrality_table[order(centrality_table$btw, decreasing = T),][1:30]
## rn deg btw cls
## 1: 340 473 609655.8 6.186587e-06
## 2: 3774 64 465741.4 6.090431e-06
## 3: 2188 331 454134.2 6.176004e-06
## 4: 1382 426 390233.1 6.166559e-06
## 5: 2279 171 368500.3 6.157787e-06
## 6: 580 395 358465.6 6.172420e-06
## 7: 193 234 331555.7 6.169640e-06
## 8: 146 137 324197.2 6.143978e-06
## 9: 3682 335 317910.9 6.163556e-06
## 10: 4029 317 315105.6 6.153657e-06
## 11: 3577 127 294856.7 6.140771e-06
## 12: 3484 234 290454.8 6.162758e-06
## 13: 3885 264 283263.8 6.150062e-06
## 14: 3304 213 280042.8 6.144091e-06
## 15: 3797 282 276848.0 6.172116e-06
## 16: 1701 338 262874.9 6.169640e-06
## 17: 3364 340 253341.2 6.155627e-06
## 18: 3316 225 241419.5 6.140696e-06
## 19: 2241 183 234138.4 6.170440e-06
## 20: 1229 311 234044.8 6.163632e-06
## 21: 3320 101 232511.9 6.108661e-06
## 22: 2564 164 231007.1 6.148360e-06
## 23: 3877 277 228709.9 6.125987e-06
## 24: 5 63 228015.4 6.079323e-06
## 25: 507 315 222811.3 6.155551e-06
## 26: 3494 215 218900.1 6.167357e-06
## 27: 3361 154 217052.9 6.120026e-06
## 28: 2709 134 199058.2 6.121262e-06
## 29: 2276 153 188945.1 6.138434e-06
## 30: 3930 236 186769.6 6.149494e-06
## rn deg btw cls
## name
## 1: Frankfurt am Main Airport
## 2: Ted Stevens Anchorage International Airport
## 3: Dubai International Airport
## 4: Charles de Gaulle International Airport
## 5: Narita International Airport
## 6: Amsterdam Airport Schiphol
## 7: Lester B. Pearson International Airport
## 8: Montreal / Pierre Elliott Trudeau International Airport
## 9: Hartsfield Jackson Atlanta International Airport
## 10: Domodedovo International Airport
## 11: Seattle Tacoma International Airport
## 12: Los Angeles International Airport
## 13: Suvarnabhumi Airport
## 14: Kuala Lumpur International Airport
## 15: John F Kennedy International Airport
## 16: Atatürk International Airport
## 17: Beijing Capital International Airport
## 18: Singapore Changi Airport
## 19: Doha International Airport
## 20: Adolfo Suárez Madrid–Barajas Airport
## 21: Brisbane International Airport
## 22: Guarulhos - Governador André Franco Montoro International Airport
## 23: McCarran International Airport
## 24: Port Moresby Jacksons International Airport
## 25: London Heathrow Airport
## 26: Newark Liberty International Airport
## 27: Sydney Kingsford Smith International Airport
## 28: El Dorado International Airport
## 29: Taiwan Taoyuan International Airport
## 30: Incheon International Airport
## name
## city country IATA ICAO Latitude Longitude
## 1: Frankfurt Germany FRA EDDF 50.03333 8.570556
## 2: Anchorage United States ANC PANC 61.17440 -149.996002
## 3: Dubai United Arab Emirates DXB OMDB 25.25280 55.364399
## 4: Paris France CDG LFPG 49.01280 2.550000
## 5: Tokyo Japan NRT RJAA 35.76470 140.386002
## 6: Amsterdam Netherlands AMS EHAM 52.30860 4.763890
## 7: Toronto Canada YYZ CYYZ 43.67720 -79.630600
## 8: Montreal Canada YUL CYUL 45.47060 -73.740799
## 9: Atlanta United States ATL KATL 33.63670 -84.428101
## 10: Moscow Russia DME UUDD 55.40880 37.906300
## 11: Seattle United States SEA KSEA 47.44900 -122.308998
## 12: Los Angeles United States LAX KLAX 33.94250 -118.407997
## 13: Bangkok Thailand BKK VTBS 13.68110 100.747002
## 14: Kuala Lumpur Malaysia KUL WMKK 2.74558 101.709999
## 15: New York United States JFK KJFK 40.63980 -73.778900
## 16: Istanbul Turkey ISL LTBA 40.97690 28.814600
## 17: Beijing China PEK ZBAA 40.08010 116.584999
## 18: Singapore Singapore SIN WSSS 1.35019 103.994003
## 19: Doha Qatar DIA OTBD 25.26110 51.565102
## 20: Madrid Spain MAD LEMD 40.47193 -3.562640
## 21: Brisbane Australia BNE YBBN -27.38420 153.117004
## 22: Sao Paulo Brazil GRU SBGR -23.43556 -46.473057
## 23: Las Vegas United States LAS KLAS 36.08010 -115.152000
## 24: Port Moresby Papua New Guinea POM AYPY -9.44338 147.220001
## 25: London United Kingdom LHR EGLL 51.47060 -0.461941
## 26: Newark United States EWR KEWR 40.69250 -74.168701
## 27: Sydney Australia SYD YSSY -33.94610 151.177002
## 28: Bogota Colombia BOG SKBO 4.70159 -74.146900
## 29: Taipei Taiwan TPE RCTP 25.07770 121.233002
## 30: Seoul South Korea ICN RKSI 37.46910 126.450996
## city country IATA ICAO Latitude Longitude
## Altitude Timezone DST TZ Type Source
## 1: 364 1 E Europe/Berlin airport OurAirports
## 2: 152 -9 A America/Anchorage airport OurAirports
## 3: 62 4 U Asia/Dubai airport OurAirports
## 4: 392 1 E Europe/Paris airport OurAirports
## 5: 141 9 U Asia/Tokyo airport OurAirports
## 6: -11 1 E Europe/Amsterdam airport OurAirports
## 7: 569 -5 A America/Toronto airport OurAirports
## 8: 118 -5 A America/Toronto airport OurAirports
## 9: 1026 -5 A America/New_York airport OurAirports
## 10: 588 3 N Europe/Moscow airport OurAirports
## 11: 433 -8 A America/Los_Angeles airport OurAirports
## 12: 125 -8 A America/Los_Angeles airport OurAirports
## 13: 5 7 U Asia/Bangkok airport OurAirports
## 14: 69 8 N Asia/Kuala_Lumpur airport OurAirports
## 15: 13 -5 A America/New_York airport OurAirports
## 16: 163 3 E Europe/Istanbul airport OurAirports
## 17: 116 8 U Asia/Shanghai airport OurAirports
## 18: 22 8 N Asia/Singapore airport OurAirports
## 19: 35 3 U Asia/Qatar airport OurAirports
## 20: 1998 1 E Europe/Madrid airport OurAirports
## 21: 13 10 N Australia/Brisbane airport OurAirports
## 22: 2459 -3 S America/Sao_Paulo airport OurAirports
## 23: 2181 -8 A America/Los_Angeles airport OurAirports
## 24: 146 10 U Pacific/Port_Moresby airport OurAirports
## 25: 83 0 E Europe/London airport OurAirports
## 26: 18 -5 A America/New_York airport OurAirports
## 27: 21 10 O Australia/Sydney airport OurAirports
## 28: 8361 -5 U America/Bogota airport OurAirports
## 29: 106 8 U Asia/Taipei airport OurAirports
## 30: 23 9 U Asia/Seoul airport OurAirports
## Altitude Timezone DST TZ Type Source
Gephi Graphs
The interesting point in this chart is Anchorage, Alaska. Geographically, one might think that it has high betweenness because it acts as a bridge between the Far East and the west coast of the USA, but this is not the case. On further analysis, this node acts as a bridge connecting all Alaskan airports and some Canadian airports to the rest of the USA, and therefore to the rest of the world.
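The Anchorage pattern, modest degree but very high betweenness, can be reproduced on a small hypothetical graph: two tightly connected groups that can only reach each other through one low-degree node. The vertex numbering below is an assumption made for illustration.

```r
library(igraph)

# Toy undirected graph (hypothetical): two 4-node cliques, vertices 1-4
# and 5-8, joined only through a low-degree bridge, vertex 9
g <- disjoint_union(make_full_graph(4), make_full_graph(4))
g <- add_vertices(g, 1)
g <- add_edges(g, c(9, 1, 9, 5))

deg <- degree(g)
btw <- betweenness(g, directed = FALSE)

# The bridge has the fewest connections but carries every
# shortest path between the two cliques (4 * 4 = 16 pairs)
print(data.frame(vertex = 1:9, deg = deg, btw = btw))
```

Sorting such a table by `btw` instead of `deg` is what surfaces bridge nodes that a degree ranking would miss.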
centrality_table[order(centrality_table$cls, decreasing = T),][1:30]
## rn deg btw cls
## 1: 340 473 609655.81 6.186587e-06
## 2: 2188 331 454134.18 6.176004e-06
## 3: 580 395 358465.59 6.172420e-06
## 4: 3797 282 276848.04 6.172116e-06
## 5: 2241 183 234138.40 6.170440e-06
## 6: 193 234 331555.67 6.169640e-06
## 7: 1701 338 262874.91 6.169640e-06
## 8: 3494 215 218900.10 6.167357e-06
## 9: 1382 426 390233.09 6.166559e-06
## 10: 1678 245 113311.03 6.165380e-06
## 11: 1555 307 182097.58 6.165152e-06
## 12: 346 297 116526.20 6.164886e-06
## 13: 1229 311 234044.76 6.163632e-06
## 14: 3714 139 125238.64 6.163632e-06
## 15: 3682 335 317910.95 6.163556e-06
## 16: 3484 234 290454.82 6.162758e-06
## 17: 1218 284 119789.97 6.161012e-06
## 18: 4353 1 0.00 6.158394e-06
## 19: 2279 171 368500.29 6.157787e-06
## 20: 1524 222 84419.06 6.157332e-06
## 21: 2985 235 153089.70 6.156422e-06
## 22: 3364 340 253341.16 6.155627e-06
## 23: 507 315 222811.26 6.155551e-06
## 24: 2179 148 83773.71 6.154793e-06
## 25: 4029 317 315105.61 6.153657e-06
## 26: 345 241 63925.95 6.153240e-06
## 27: 302 226 89208.89 6.152029e-06
## 28: 609 228 163821.18 6.151840e-06
## 29: 3670 212 122915.02 6.151650e-06
## 30: 478 241 110645.07 6.150402e-06
## rn deg btw cls
## name city
## 1: Frankfurt am Main Airport Frankfurt
## 2: Dubai International Airport Dubai
## 3: Amsterdam Airport Schiphol Amsterdam
## 4: John F Kennedy International Airport New York
## 5: Doha International Airport Doha
## 6: Lester B. Pearson International Airport Toronto
## 7: Atatürk International Airport Istanbul
## 8: Newark Liberty International Airport Newark
## 9: Charles de Gaulle International Airport Paris
## 10: Zürich Airport Zurich
## 11: Leonardo da Vinci–Fiumicino Airport Rome
## 12: Munich Airport Munich
## 13: Adolfo Suárez Madrid–Barajas Airport Madrid
## 14: Washington Dulles International Airport Washington
## 15: Hartsfield Jackson Atlanta International Airport Atlanta
## 16: Los Angeles International Airport Los Angeles
## 17: Barcelona International Airport Barcelona
## 18: Anapa Vityazevo Airport Anapa
## 19: Narita International Airport Tokyo
## 20: Malpensa International Airport Milano
## 21: Sheremetyevo International Airport Moscow
## 22: Beijing Capital International Airport Beijing
## 23: London Heathrow Airport London
## 24: Abu Dhabi International Airport Abu Dhabi
## 25: Domodedovo International Airport Moscow
## 26: Düsseldorf Airport Duesseldorf
## 27: Brussels Airport Brussels
## 28: Copenhagen Kastrup Airport Copenhagen
## 29: Dallas Fort Worth International Airport Dallas-Fort Worth
## 30: Manchester Airport Manchester
## name city
## country IATA ICAO Latitude Longitude Altitude Timezone
## 1: Germany FRA EDDF 50.03333 8.570556 364 1
## 2: United Arab Emirates DXB OMDB 25.25280 55.364399 62 4
## 3: Netherlands AMS EHAM 52.30860 4.763890 -11 1
## 4: United States JFK KJFK 40.63980 -73.778900 13 -5
## 5: Qatar DIA OTBD 25.26110 51.565102 35 3
## 6: Canada YYZ CYYZ 43.67720 -79.630600 569 -5
## 7: Turkey ISL LTBA 40.97690 28.814600 163 3
## 8: United States EWR KEWR 40.69250 -74.168701 18 -5
## 9: France CDG LFPG 49.01280 2.550000 392 1
## 10: Switzerland ZRH LSZH 47.46470 8.549170 1416 1
## 11: Italy FCO LIRF 41.80028 12.238889 13 1
## 12: Germany MUC EDDM 48.35380 11.786100 1487 1
## 13: Spain MAD LEMD 40.47193 -3.562640 1998 1
## 14: United States IAD KIAD 38.94450 -77.455803 312 -5
## 15: United States ATL KATL 33.63670 -84.428101 1026 -5
## 16: United States LAX KLAX 33.94250 -118.407997 125 -8
## 17: Spain BCN LEBL 41.29710 2.078460 12 1
## 18: Russia AAQ URKA 45.00210 37.347301 174 3
## 19: Japan NRT RJAA 35.76470 140.386002 141 9
## 20: Italy MXP LIMC 45.63060 8.728110 768 1
## 21: Russia SVO UUEE 55.97260 37.414600 622 3
## 22: China PEK ZBAA 40.08010 116.584999 116 8
## 23: United Kingdom LHR EGLL 51.47060 -0.461941 83 0
## 24: United Arab Emirates AUH OMAA 24.43300 54.651100 88 4
## 25: Russia DME UUDD 55.40880 37.906300 588 3
## 26: Germany DUS EDDL 51.28950 6.766780 147 1
## 27: Belgium BRU EBBR 50.90140 4.484440 184 1
## 28: Denmark CPH EKCH 55.61790 12.656000 17 1
## 29: United States DFW KDFW 32.89680 -97.038002 607 -6
## 30: United Kingdom MAN EGCC 53.35370 -2.274950 257 0
## country IATA ICAO Latitude Longitude Altitude Timezone
## DST TZ Type Source
## 1: E Europe/Berlin airport OurAirports
## 2: U Asia/Dubai airport OurAirports
## 3: E Europe/Amsterdam airport OurAirports
## 4: A America/New_York airport OurAirports
## 5: U Asia/Qatar airport OurAirports
## 6: A America/Toronto airport OurAirports
## 7: E Europe/Istanbul airport OurAirports
## 8: A America/New_York airport OurAirports
## 9: E Europe/Paris airport OurAirports
## 10: E Europe/Zurich airport OurAirports
## 11: E Europe/Rome airport OurAirports
## 12: E Europe/Berlin airport OurAirports
## 13: E Europe/Madrid airport OurAirports
## 14: A America/New_York airport OurAirports
## 15: A America/New_York airport OurAirports
## 16: A America/Los_Angeles airport OurAirports
## 17: E Europe/Madrid airport OurAirports
## 18: N Europe/Moscow airport OurAirports
## 19: U Asia/Tokyo airport OurAirports
## 20: E Europe/Rome airport OurAirports
## 21: N Europe/Moscow airport OurAirports
## 22: U Asia/Shanghai airport OurAirports
## 23: E Europe/London airport OurAirports
## 24: U Asia/Dubai airport OurAirports
## 25: N Europe/Moscow airport OurAirports
## 26: E Europe/Berlin airport OurAirports
## 27: E Europe/Brussels airport OurAirports
## 28: E Europe/Copenhagen airport OurAirports
## 29: A America/Chicago airport OurAirports
## 30: E Europe/London airport OurAirports
## DST TZ Type Source
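The warning printed by `closeness()` above signals that unreachable airport pairs distort the scores, which may be why every value in these tables sits near 6e-06. One possible way to sidestep this, sketched here on a hypothetical four-node graph, is to compute closeness only inside the largest strongly connected component. This is an assumption about how the analysis could be refined, not what was done above.

```r
library(igraph)

# Toy directed graph (hypothetical): cycle 1 -> 2 -> 3 -> 1 plus a
# one-way spur 3 -> 4, so vertex 4 cannot reach any other vertex
g <- make_graph(c(1, 2, 2, 3, 3, 1, 3, 4), directed = TRUE)

# Keep only the largest strongly connected component
comp  <- components(g, mode = "strong")
keep  <- which(comp$membership == which.max(comp$csize))
giant <- induced_subgraph(g, keep)

# Every vertex in the component can now reach every other one
cls_giant <- closeness(giant, mode = "out")
print(cls_giant)
```

Within the component each score is the inverse of the summed distances to the other vertices, so the values become directly comparable across airports.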
However, here we decided to look further into the cities and to merge airports belonging to the same city.
citiesgraph <- graph.data.frame(fread("citiesgraph.csv"))
cls <-closeness(citiesgraph)
## Warning in closeness(citiesgraph): At centrality.c:2784 :closeness
## centrality is not well-defined for disconnected graphs
sort(cls, decreasing = T)[1:20]
## London Frankfurt Koumac Paris Amsterdam
## 9.905698e-06 9.892273e-06 9.891784e-06 9.888947e-06 9.869622e-06
## New York Dubai Rome Toronto Los Angeles
## 9.869525e-06 9.867674e-06 9.853576e-06 9.850664e-06 9.849209e-06
## Istanbul Anapa Bangkok Tokyo Moscow
## 9.846299e-06 9.844264e-06 9.842132e-06 9.841648e-06 9.840970e-06
## Beijing Munich Seoul Atlanta Zurich
## 9.839033e-06 9.837872e-06 9.837485e-06 9.836904e-06 9.835066e-06
Gephi Graphs
Now, after joining cities together, we see that London overtook Frankfurt. This makes sense, as Frankfurt has only 2 airports and is not even the capital of its country. Two other major world airports are not far away in the same country (Munich and Berlin), whereas London has around 5 airports serving internal flights within the UK and low-cost flights within Europe.
# Clustering Community Detection
When we clustered using the modularity coefficient in Gephi, we obtained the following clusters. We then decided to look at the geographical location of the nodes in these clusters to be able to visualize and fully understand the results.
Gephi Graphs
Not surprisingly, the members of each cluster were close to each other geographically. This makes complete sense, since airports within the same continent tend to be more connected to each other. Although some clusters contain countries belonging to different continents, this only highlights the connections between the airports in those countries. The most obvious examples are the northern countries of South America: the clustering suggests that Venezuela might be well connected to US states. Other isolates such as Canada and Alaska reinforce the idea of communities within each other. The clusters can be labeled as North America, South America, Europe, Africa, Middle East India, Far East Asia, Oceania, Central Asia.
It is easy to distinguish clusters when they are spread across the map. However, let's now look at them without using the geo-layout plugin.
The clusters here appear in a different way. We can deduce that South America, Africa, and the Middle East are connected to the rest of the world by a few hubs: Sao Paulo; Cairo and Marrakesh; and Dubai, respectively. Those are the nodes, or cities in this case, that connect these geographical areas to other areas. Europe, by contrast, lies in the middle and is more attached to the entire world. As we saw before, the nodes with the highest degrees, and many of those at the top of the betweenness table, belong to Europe. Geographically speaking, this also makes sense: Europe is closer to all the main regions than other areas around the globe.
Calculate the clusters using the Louvain algorithm.
cl <- cluster_louvain(as.undirected(flights))
plot(cl, as.undirected(flights), vertex.label =NA)
modularity(cl)
## [1] 0.6535182
cfg <- cluster_fast_greedy(as.undirected(flights))
plot(cfg, as.undirected(flights), vertex.label = NA)
modularity(cfg)
## [1] 0.6049943
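To see what the two modularity scores above measure, the same two algorithms can be run on a hypothetical graph whose community structure is known in advance: two 5-node cliques joined by a single edge. The reference partition and its hand-computed value are included for comparison.

```r
library(igraph)

# Toy undirected graph (hypothetical): two 5-node cliques joined by one edge
g <- disjoint_union(make_full_graph(5), make_full_graph(5))
g <- add_edges(g, c(5, 6))

cl  <- cluster_louvain(g)
cfg <- cluster_fast_greedy(g)

# Modularity of the "one community per clique" partition, for reference:
# 2 * (10/21 - (21/42)^2) = 19/42
q_reference <- modularity(g, c(rep(1, 5), rep(2, 5)))

print(c(louvain = modularity(cl), fast_greedy = modularity(cfg),
        reference = q_reference))
```

A higher modularity means more edges fall inside communities than a random graph with the same degrees would produce, which is the sense in which the Louvain result (0.65) fits the flights graph slightly better than the fast greedy one (0.60).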
Community detection based on propagating labels: it assigns node labels, randomizes, then replaces each vertex’s label with the label that appears most frequently among its neighbors. Those steps are repeated until each vertex has the most common label of its neighbors.
clp <- cluster_label_prop(as.undirected(citiesgraph))
plot(clp, as.undirected(citiesgraph), vertex.label = NA)
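Because label propagation involves random choices, two runs on the same graph can return different communities. The sketch below, on a hypothetical two-clique graph, fixes the random seed so that a run can be repeated; the seed value 42 is an arbitrary assumption.

```r
library(igraph)

# Toy undirected graph (hypothetical): two 5-node cliques joined by one edge
g <- disjoint_union(make_full_graph(5), make_full_graph(5))
g <- add_edges(g, c(5, 6))

# Label propagation is randomized, so fix the seed for a repeatable run
set.seed(42)
clp <- cluster_label_prop(g)

# One label per vertex; vertices sharing a label form a community
labels <- membership(clp)
print(table(as.vector(labels)))
```

Setting a seed before `cluster_label_prop()` in the analysis would likewise make the resulting plot reproducible between knits.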
colrs <- adjustcolor( c("gray50", "tomato", "gold", "yellowgreen"), alpha=.6)
kc <- coreness(flights, mode="all")
plot(flights, vertex.size=kc*6, vertex.color=colrs[kc], vertex.label = NA)
LouvainCluster <- cluster_louvain(as.undirected(flights))
plot(LouvainCluster, as.undirected(flights))
#Community detection based on greedy optimization of modularity
cfg <- cluster_fast_greedy(as.undirected(flights))
plot(cfg, as.undirected(flights), vertex.label = NA)
Airports have a huge impact on a country's economy. They bring in revenue from airport taxes and from commercial spending inside airport shops. We want to apply our insights to help cities take advantage of these benefits, grow, and maximize their potential revenue and economic impact, by giving suggestions on how to leverage this graph network.
First, let us understand the impact of airports on the economy by looking at numbers.
We see that in 2018, Spain ranked second by revenue generated from airports. However, we believe that this is not enough: Spain is not living up to its potential.
Number of Passengers
We would like to show our support for building another airport here in Madrid, so that Spain can be part of this list in the upcoming years. From our experience as citizens of Madrid, Barajas Airport is relatively expensive. Flights to major world cities, especially neighbouring cities, are more expensive from Madrid than from other major European cities, and even from other Spanish airports.
We believe, if Spain successfully leverages its political connections and uses its geographical location to its advantage, it can make Madrid a bigger hub for travelling passengers. Politically, Spain has very strong relationships with South American countries. Geographically, Spain is the closest European country to Africa. As we saw previously in our charts, Africa and South America are not as strongly connected to the rest of the world, in comparison to other continents.
Building a new airport can free up Barajas and transform it into a bigger betweenness hub. A new airport focused on cheap local and regional flights has many advantages. The amount invested in building a new airport is large enough to boost the Madrid economy. Many job opportunities will become available to the people of Madrid. Increased local and regional tourism, plus increased airport taxes, will have a positive impact on the entire city. Barajas can then be transformed into an international betweenness airport, catching up with neighbouring cities such as Frankfurt, Paris, and London.
SOCIAL NETWORKS ANALYSIS
Professor: ALVARO ROMERO MIRALLES
Program: MBD April intake
Group Final Assignment
Team A
The following dataset is going to be used:
* OpenFlights dataset (directed graph): “The data is downloaded from Openflights.org. a directed network containing flights between airports of the world, in which directed edge represents a flight from one airport to another from the year 2010. Here it has 2939 nodes and 30501 edges. As such, it gives much more of a complete picture and avoids the sample selection. The weights in this network refer to the number of routes between two airports.” http://opsahl.co.uk/tnet/datasets/openflights.dl
metadata downloaded from: https://openflights.org/data.html https://raw.githubusercontent.com/jpatokal/openflights/master/data/airports.dat
Some of the airport ids are missing from our metadata. We assume that these are privately owned airports or military bases, and are therefore not included in the airports dataset.
We had to download the data and edit the headers for R to be able to read it, because read_graph("http://opsahl.co.uk/tnet/datasets/openflights.dl", format = c("dl")) was giving us an error. As for the metadata, we changed the file to CSV for readability.
Both edited files are in the downloaded folder.
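The airport ids that have no metadata can be listed explicitly after the merge. The sketch below uses two small hypothetical tables (`nodes` and `meta`, with made-up values) rather than the real files, and assumes a left join is acceptable for this check.

```r
# Hypothetical toy tables: node 3 has no row in the metadata
nodes <- data.frame(rn = c(1, 2, 3), deg = c(10, 4, 1))
meta  <- data.frame(id = c(1, 2), city = c("Frankfurt", "Paris"),
                    stringsAsFactors = FALSE)

# all.x = TRUE keeps every graph node and fills missing metadata with NA
merged <- merge(nodes, meta, by.x = "rn", by.y = "id", all.x = TRUE)

# Node ids present in the graph but absent from the metadata
missing_ids <- merged$rn[is.na(merged$city)]
print(missing_ids)
```

With `all = T`, as used in the analysis, the merged table also keeps airports that appear in the metadata but not in the graph; those rows have missing centrality values instead.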